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This  paper  addresses  the  problem  of  goal-directed  robot  path 
planning  in  the  presence  of  uncertainties  that  are  induced  by 
bounded  environmental  disturbances  and  actuation  errors.  The 
offline  infinite -horizon  optimal  plan  is  locally  updated  by  online 
finite-horizon  adaptive  replanning  upon  observation  of  unexpected 
events  (e.g.,  detection  of  unanticipated  obstacles).  The  underlying 
theory  is  developed  as  an  extension  of  a  grid-based  path  planning 
algorithm,  called  v*.  which  was  formulated  in  the  framework  of 
probabilistic  finite  state  automata  (PFSA)  and  language  measure 
from  a  control-theoretic  perspective.  The  proposed  concept  has 
been  validated  on  a  simulation  te.st  bed  that  is  constructed  upon  a 
model  of  typical  autonomous  iinderw’ater  vehicles  (AUVs)  in  the 
presence  of  uncertainties.  [DOI:  10.1115/1.4027876] 
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1  Motivation  and  Introduction 

In  general,  path  planning  of  robots  (e.g.,  AUVs)  and  unmanned 
aerial  vehicles  (UAVs))  aims  to  optimize  either  travel  time, 
energy  usage,  or  safety  of  operations.  Recently,  much  work  has 
been  reported  for  path  planning  in  the  presence  of  environmental 
uncertainties.  A  few  examples  follow. 

Garau  et  al.  [1]  used  the  A*  algorithm  for  path  planning  of 
AUVs  by  taking  the  effects  of  ocean  currents  into  consideration. 
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Petres  et  al.  [2]  reported  path  planning  of  AUVs,  in  which  a  fast 
marching  algorithm  was  used  to  model  the  effects  of  ocean  cur¬ 
rents  as  an  anisotropic  cost  function.  Rhoads  et  al.  [3]  solved  a 
dynamic  Hamilton  Jacobi  Bellman  equation  to  obtain  the  optimal 
“time-to-go”  with  a  discontinuous  and  dynamic  cost  function  for 
path  planning  in  the  presence  of  ocean  currents.  Recently,  Lolla 
et  al.  [4]  developed  a  methodology  for  AUV  path  planning,  in 
which  the  concept  of  flow  advection  was  combined  with  nominal 
vehicle  motion  until  the  desired  goal  was  reached.  Majumdar  and 
Tedrake  [5]  generated  libraries  in  the  offline  precomputation  stage 
for  online  robust  motion  planning,  where  some  of  the  library 
members  were  regenerated  for  collision  avoidance.  Blackmore 
et  al.  [6]  designed  a  chance-constrained  predictive  stochastic  con¬ 
troller  to  ensure  that  the  probability  of  failure  (e.g.,  collision)  is 
less  than  a  specified  threshold  even  in  the  presence  of  randomly 
varying  uncertainties.  Chakravorty  and  Kumar  [7]  used  the  con¬ 
cepts  of  probabilistic  roadmaps  and  rapidly  exploring  random  trees 
[8]  to  constmct  feedback  controllers  for  robust  planning  in  the  pres¬ 
ence  of  motion  uncertainties.  However,  in  many  path  planning 
algorithms  reported  in  literature,  the  offline  computation  does  not 
take  into  account  potential  runtime  constraints  (e.g.,  unanticipated 
obstacles),  and  online  adaptation  is  computationally  infeasible. 

Robots  (e.g.,  AUVs)  are  expected  to  operate  in  environments  with 
external  disturbances  and  hence,  considering  the  effects  of  these  dis¬ 
turbances  is  critical  for  mission  success.  Consequently,  the  algo¬ 
rithms  of  path  planning  must  be  capable  of  real-time  execution  on  in 
situ  computational  platforms.  To  meet  these  challenges,  Chattopad- 
hyay  et  al.  [9]  formulated  the  v*  algorithm  in  the  framework  of 
PFSA  from  a  control-theoretic  perspective.  The  v*  algorithm  per¬ 
forms  robot  path  planning  for  a  specified  goal  by  optimizing  the  lan¬ 
guage  measure  of  the  PFSA  model  [10,11]  that  represents  the 
workspace  for  the  robot.  In  this  setting,  optimal  path  planning  is 
equivalent  to  maximization  of  a  PFSA’s  performance,  based  on  a 
quantitative  measure  of  probabilistic  regular  languages.  This  concept 
of  robot  path  planning  jointly  maximizes  the  probability  of  reaching 
the  target  and  the  probability  of  staying  clear  of  the  obstacles.  Such  a 
notion  of  goal-directed  safe  (e.g.,  Pareto  optimization  [12]  of  reach¬ 
ing  the  goal  and  collision  avoidance)  navigation  of  autonomous 
vehicles  is  particularly  important  in  the  context  of  planning  in  the 
presence  of  disturbances,  especially  to  ensure  collision-free  naviga¬ 
tion.  In  this  case,  Markov  decision  process  (MDP)  tools  may  not  be 
suitable  because  of  the  requirement  of  robust  adaptation  in  real  time 
[5].  The  situation  becomes  worse  for  lack  of  observability,  because 
the  formulation  of  a  partially  observable  MDP  would  become  com¬ 
putationally  intractable  for  a  real-time  solution. 

Language-measure-theoretic  path  planning  offers  an  inherent 
advantage  of  global  monotonicity  [9]  in  the  sense  that  the  solution 
iterations  are  finite  and  that  a  sequence  of  final  waypoints  is  gen¬ 
erated.  The  PFSA,  constructed  out  of  the  robot’s  workspace,  is 
optimized  via  an  iterative  sequence  of  combinatorial  operations  to 
maximize  the  language  measure  vector  elementwise.  Neverthe¬ 
less,  the  V*  algorithm  is  different  from  a  conventional  search  algo¬ 
rithm  in  the  sense  that  it  automatically  generates  a  measure 
gradient  (e.g.,  no  potential  function  required),  which  is  maximized 
at  the  goal.  Furthermore,  the  time  complexity  of  each  iteration 
step  is  linear  relative  to  the  problem  size  (e.g.,  the  dimension  of 
the  discretized  state  space),  and  computation  of  the  measure  vec¬ 
tor  in  a  distributed  fashion  makes  the  algorithm  especially  suitable 
for  high-dimensional  planning.  These  advantages  render  the  pre¬ 
sented  method  potentially  suitable  for  (offline)  planning  and 
(online)  adaptation  in  the  presence  of  uncertainties. 

As  an  extension  of  the  v*  algorithm  [9],  this  paper  formulates 
an  algorithm  of  grid-based  robot  path  planning,  which  has  been 
validated  on  a  simulation  test  bed  of  AUVs  in  the  presence  of 
uncertain  ocean  currents  and  actuation  errors.  Major  contributions 
of  the  paper  beyond  the  work  on  the  v*  algorithm,  reported  by 
Chattopadhyay  et  al.  [9],  are  outlined  below: 

•  The  notion  of  uncertainty  is  associated  with  the  uncontrol¬ 
lable  transitions  in  a  controlled  autonomous  system.  This 
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concept  is  different  from  the  common  practice  of  superim¬ 
posing  the  effects  of  disturbances  on  the  planned  trajectory 
of  a  robotic  system. 

•  The  offline  infinite-horizon  optimal  plan  is  locally  updated 
by  online  finite -horizon  adaptive  replanning  upon  observa¬ 
tion  of  unexpected  events  (e.g.,  detection  of  unanticipated 
obstacles). 

2  Brief  Review  of  Language  Measure  Theory 

This  section  summarizes  the  concept  of  (signed  real)  language 
measure  [10]  of  PFSA  [13]  and  the  role  of  language  measure  for 
optimal  control  [11]  of  PFSA.  While  the  theories  of  language 
measure  and  the  associated  optimal  control  are  developed  in  Refs. 
[10]  and  [11],  this  section  introduces  pertinent  definitions  and 
summarizes  the  essential  concepts  that  are  used  in  the  sequel. 

Definition  2.1.  (Alphabet  and  Word)  An  alphabet  S  is  a  non¬ 
empty  finite  set  of  symbols.  A  finite-length  string  of  symbols  from 
S  is  called  a  word  and  the  empty  word  (i.e.,  a  string  of  no  sym¬ 
bols)  is  denoted  as  e.  A  word  w  of  i  symbols  has  length  Iwl  =  £ 
and  |e|  =  0.  The  set  of  all  words  (including  t)  is  denoted  as  S*,- 
the  set  of  all  words  of  length  £  is  denoted  as  and  the  set  of  all 
infinite  strings  of  symbols  is  denoted  as  S“. 

Referring  to  Definition  2.1,  the  cardinalities  of  the  following 
sets  are  noted  below: 

•  |S|  G  N,  where  N  is  the  set  of  positive  integers. 

•  |S^|  =  |S|^  for  any  positive  integer  £. 

•  |S*|  =  |N|,  i.e.,  S*  is  countably  infinite  (Kq)- 

•  |S“|  =  |R|,  i.e.,  S“  is  uncountably  inhnite  (continuum). 

Definition  2.2  (Prefix  and  Suffix)  Let  Z  be  an  alphabet.  The 
operation  of  concatenation  of  two  words  is  closed  on  S*,  i.e.,  if 
w  =  xy,  then  w  E  Z*  'ix,y  E  Z*;  and  c  is  the  identity  element  of 
the  concatenation  monoid  [14].  In  that  case,  x  is  called  a  prefix  of 
w,  and  y  is  called  a  suffix  ofw. 

Definition  2.3.  A  PFSA  over  an  alphabet  Z  is  a  quintuple 

G  =  where 


•  Q  is  the  nonempty  finite  set  of  states,  i.e.,  \Q\  G  N; 

•  S:  Q  xX*  Q  is  the  transition  map  that  satisfies  the  follow¬ 
ing  conditions:  Vi?  E  2  Vi  E  Z  Vw’  E  Z* 

S(q,  e)  =  q;  and  S(q,  ws)  =  b(b(q,  w),  s). 

•  n:  Q  xX*  [0,  1]  is  the  morph  function  of  state-.specific 
symbol  generation  probabilities,  which  satisfies  the  follow¬ 
ing  conditions:  Vi?  E  2  Vi  E  Z  Vw’  E  Z* 

tt(q,  s)  >  0;  XiiES  ■*)  =  I.'  and 

n(q,  e)  =  l;  n(q,  ws)  =  n(q,  w)  x  n(S(q,  w),  s). 

•  Z  •  2  ^  [~li  1]'^'  w  the  vector-valued  characteristic  func¬ 
tion  that  assigns  a  signed  (normalized)  real  weight  to  each 
state. 

The  (|2|  X  |2|)  state  transition  probability  matrix  P  is  defined  as 
P=[Pik],  where  Py*  =  ^  ffiqj,s)'iqj,qk  e  Q 

SC'L-.  b(qj,s)=qt 

Note:  P  is  a  non-negative  stochastic  matrix  [15]. 

Remark  2.1.  The  restrictions  (5|gx5;  2  ^nd  ttlgxj;  ^  [0, 1]  of 

the  functions  <5  and  n  in  Definition  2.3  can  be  directly  obtained 
from  the  local  structure  of  the  PFSA.  The  unrestricted  functions  8 
and  71  can  then  be  computed  over  the  entire  domain  by  using  the 
recursive  relations  listed  in  Definition  2.3.  Physical  interpretations 
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of  X  are  briefly  provided  in  Subsection  2.1,  while  the  details  are 
reported  in  Detinition  5  of  Ref.  [10]. 

Definition  2.4.  (Language)  Let  {Q,'L,8,n,x)  he  a  PFSA. 
The  language  generated  by  all  words,  which  terminates  at  a  state 
qk  £  2  ofter  starting  from  qj  E  Q,  is  defined  as 

P{di,4k)  =  {w  G  Z*  :  &{qj,w)  =  <?*}  (1) 

The  language  generated  by  all  words,  which  may  terminate  at  any 
state  q  ^  Q  after  starting  from  qj  E  Q,  is  defined  as 


Li4j)  =  [jL{cij,q)  (2) 

q^Q 

Given  a  PFSA  (2,  £,  8,  n,  /),  the  following  facts  are  considered 
before  the  notion  of  a  measure  is  introduced  on  the  language  L 
(see  Definition  2.4). 

•  The  language  L  is  an  at  most  countable  (i.e.,  Hnite  or  count¬ 
ably  infinite)  set  because  L  C  Z*.  Therefore,  the  power  set 
if  is  either  finite  or  uncountable;  if  uncountable,  the  cardin¬ 
ality  of  if  is  the  same  as  that  of  R  or  that  of  the  standard 
B  Orel  algebra  £?(R)  [16]. 

•  A  measurable  space  (L,  Z“2^)  can  be  constructed  where  ev¬ 
ery  singleton  set  of  a  semi-intinite  string  of  symbols,  whose 
suffix  is  a  word  in  L,  is  a  measurable  set  [10,16].  For  brevity, 
the  f7-algebra  Z“2^  is  denoted  as  if,  because  every  word 
w  E  L  is  prefixed  by  a  semi-infinite  string  of  symbols.  That 
is,  w  represents  xw  and  jw)  E  2^  implies  {.m’j  E  Z'"2^, 
where  x  EZ“  is  arbitrary. 

Definition  2.5.  (Language  Measure)  Let  L(qj,  q^)  and  L(qj)  be 
languages  on  a  PFSA  (2,  2, 8,  n,  x)  and  let  6  E:  (0, 1 )  be  a  param¬ 
eter.  A  signed  real  measure  fJg  :  iLu^^k)  (that  satisfies  the 
requisite  axioms  of  measure  [16])  is  defined  as 

Pe(P(‘lj,4k))  =  (3) 

weL{qj,qt) 

The  measure  of  the  language  L(qj)  is  defined  as 

^(^(V;))  =  H  Fo  4k))  (4) 

qkCQ 

Following  Definition  2.5,  the  set  of  language  measures  for  a 
PFSA  is  interpreted  as  a  real-valued  vector  of  dimension  \Q\  and 
is  denoted  as  vg  whose  jth  component  is  izj)  G  R  corresponding  to 
the  state  qj  E  Q.  Following  Theorem  1  in  Ref.  [10],  the  language 
measure  of  the  PFSA  {Q,  Z,  8,  n,  x)  in  Eq.  (2.5)  is  expressed 
vectorially  as 

v«  =  0[I-(l-0)P]-'x  (5) 

where  P  is  the  state  transition  matrix  (see  Detinition  2.3)  and  the 
inverse  on  the  right  side  exists  for  all  6  E  (0,  1).  Furthermore,  as 
9  0^,  the  matrix  0[I  —  (1  —  0)P]  *  converges  to  the  Cesaro 

matrix  'P  =  limj-^oo  |  P'.  Then,  the  limiting  measure  vector 

Vo  is  obtained  as  [11] 

Vo  =  lim  Vo  =  lim  6[I  —  (1  —  6)P]  *z  =  fx  (6) 

e^o+  0^0+  II'- 

where  I  is  the  (|2|  x  \Q\)  identity  matrix. 
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2.1  Control  Perspectives.  From  the  perspectives  of  discrete- 
event  optimal  control  [11],  the  limiting  language  measure  vector 
Vo  in  Eq.  (6)  is  physically  interpreted  as  follows.  The  language 
measure  of  a  symbol  string  starting  at  a  state  qj  £  2  is  interpreted 
to  be  the  product  of  the  morph  probability  (conditioned  on  the 
state  qj)  and  the  characteristic  weight  Xk  of  the  terminating  state 
qi^.  For  example,  if  q^.  is  a  desirable  state  (e.g.,  goal)  to  terminate, 
then  Xk  should  be  assigned  to  be  positive;  similarly,  x^k  should  be 
negative  for  an  undesirable  terminating  state  (e.g.,  obstacle)  qj^. 
Thus,  the  characteristic  weights  are  assigned  to  represent  the  con¬ 
trol  specifications  (i.e.,  larger  positive  weights  to  more  desirable 
states)  and  the  language  measure  represents  the  goodness  of  the 
particular  string  relative  to  the  given  specification  and  the  PFSA 
model  [10].  In  the  sense  of  Definition  2.5,  the  limiting  language 
measure  sums  up  to  the  limiting  measures  of  each  string  start¬ 
ing  from  state  qj.  Thus,  captures  the  goodness  of  qj  based  on 
not  only  its  own  characteristic  weight  but  also  on  how  good  are 
the  strings  that  will  be  generated  from  qj  in  the  future.  In  essence, 
the  language  measure  provides  a  quantitative  comparison  of  via¬ 
ble  control  policies  [11]. 

Definition  2.6.  (Control  philosophy)  Let  qj  — >  qk  he  the  state 
transition  under  occurrence  of  the  event  s.  If  the  event  s  is  dis¬ 
abled  at  qj  hy  a  supervisory  action,  then  this  state  transition  from 
qj  to  qk  will  he  prevented  hy  forcing  the  plant  to  stay  at  the  origi¬ 
nal  state  qj.  Thus,  disabling  an  event  s  at  a  given  state  q  results  in 
deletion  of  the  original  transition  and  appearance  of  the  self-loop 
S(q,  s)  =  q  with  the  probability  of  occurrence  of  s  at  state  q 
remaining  unchanged  in  the  supervised  plant  and  also  in  the  unsu¬ 
pervised  plant. 

Definition  2.7.  (Controllable  transitions)  For  a  given  plant, 
transitions  that  can  he  disabled  in  the  sense  of  Definition  2.6  are 
defined  to  he  controllable:  a  transition  that  is  not  controllable  is 
called  uncontrollable .  The  set  of  controllable  transitions  in  a 
plant  is  denoted  as  C. 

If  the  supervisor  is  allowed  to  disable  all  subsets  (including  the 
empty  set  0  and  the  set  C  itself)  of  the  set  C  of  controllable  transi¬ 
tions,  then  there  exists  a  bijection  between  the  set  of  all  possible 
supervision  policies  and  the  power  set  2^;  in  other  words,  there 
are  2l*'l  possible  supervisors  and  each  supervisor  is  uniquely  iden¬ 
tifiable  with  a  subset  of  C.  Thus,  the  language  measure  v  allows  a 
quantitative  comparison  of  different  policies. 

Definition  2.8.  (Controlled  PFSA  and  optimal  measure  v*)  An 

optimally  controlled  (or  supervised)  PFSA  S=  {Q,'L,S,n,x,C) 
is  a  sextuple  that  maximizes  its  language  measure  Vo  by  disabling 
a  selected  subset  ofC.  The  optimized  measure  vq  is  denoted  as  v*. 

3  Description  of  the  Simulation  Test  Bed 

This  section  describes  the  simulation  test  bed  that  is  built  upon 
a  rigid-body  dynamic  model  of  the  planar  motion  of  a  generic 
AUV  [17].  The  rationale  for  adopting  such  a  simple  model  is  that 
the  usage  of  the  proposed  language-measure-theoretic  concepts 
can  be  unambiguously  presented  to  explain  how  to  control  the 
AUV  motion  in  the  presence  of  environmental  disturbances  and 
actuation  errors.  Nevertheless,  the  theory  is  applicable  on  a 
dynamically  feasible  graph  with  any  degree  of  details. 

The  model  takes  into  account  the  effects  of  yaw  (i/i)  and  the  ve¬ 
locity  vector  of  surge  rate  (u),  sway  rate  (v),  and  yaw  rate  (r),  i.e., 

V  =  [  M  V  but  the  effects  of  heave  (z),  roll  (cj))  and  pitch  (6) 
are  assumed  to  be  negligible  [17].  The  resulting  equations  of 
motion  are  obtained  as 

Mv  +  C(v)v  +  D(v)v  =  T  and  ij/  =  r  (7) 

where  M  =  is  the  inertia  matrix;  C(v)  =  — C(v)^  is  the  Corio¬ 
lis  centripetal  matrix;  D(v)  is  the  damping  matrix;  and 
T  =  [  T„  Tv  T,.  is  the  vector  of  the  system  inputs  that  represent 
the  forces  generated  by  the  on-board  actuators  as  well  as  those 
produced  by  environmental  disturbances  (e.g.,  ocean  current).  In 


this  model,  control  actions  are  generated  as  functions  of  the  surge 
rate  (u)  and  yaw  rate  (;•)  only,  i.e.^  the  vehicle  control  input  is  re¬ 
stricted  to  be  Tcon  =  [  tu  0  T,.  ]  .  Surge  and  sway  components 

of  the  ocean  current  vector  Vc  are  modeled  in  terms  of  its  direction 
and  yaw  angle  i//  as 

Uc  =  |vc|  cos(/?  -  i//)  and  ty  =  |vc|  sm{fi  ~  i/i)  (8) 

Assuming  that  the  vehicle’s  on-board  controller  maintains  the 
desired  yaw  angle  i/i  during  its  course  of  motion  (i.e.,  letting 
{jj  =  r  =  0),  the  dynamic  motion  model  is  further  simplified  to 
obtain  the  travel  distance  F  and  G  along  the  surge  and  sway  direc¬ 
tions  [17],  respectively,  as 


F(t„,m,.)  = 


u[t)dt  and  G(vc)  = 


v{t)dt 


(9) 


4  Problem  Formulation 

This  section  formulates  the  problem  of  path  planning  for  planar 
motion  of  robots  in  the  presence  of  uncertainties  induced  by  envi¬ 
ronmental  disturbances,  where  the  workspace  is  partitioned  as  a  fi¬ 
nite  number  of  cells.  Major  assumptions  in  the  problem 
formulation  are  listed  below: 


(1)  Partial  knowledge  of  the  static  obstacles. 

(2)  No  moving  obstacles. 

(3)  Partitioning  of  the  workspace  into  cells  as  a  regular  grid. 

(4)  Bounded  disturbances  and  actuation  errors  with  the  errors 
of  PFSA  states  being  limited  to  neighboring  cells  of  the  ori¬ 
gin  cell. 

Depending  on  the  resolution  of  planning,  a  cell  may  represent  a 
single  location  or  a  subregion  in  the  planning  environment.  The 
neighborhood  of  a  cell  is  defined  by  taking  into  account  feasible 
dynamics  of  transitions  to  that  cell. 

The  robot  motion  is  modeled  as  controllable  transitions  among 
the  cells  as  the  robot  chooses  to  execute  its  moves;  such  moves 
may  depend  on  the  fidelity  of  motion  discretization  and  the 
robot’s  intrinsic  dynamics.  Based  on  the  specifications  of  intercell 
transitions,  each  cell  in  the  partitioned  workspace  is  modeled  as  a 
PFSA  state,  where  controllable  transitions  are  defined  by  the  cor¬ 
responding  state  transition  map. 

Let  a  PFSA  Gnav  =  {Q,  2,  <5,  n,  x)  (see  Definition  2.3)  repre¬ 
sent  the  navigation  model  of  a  robot,  where  the  states  in  Q  are  cell 
locations  in  the  original  workspace  and  the  alphabet  size  |Z|  is 
determined  from  the  number  of  events  that  may  cause  controllable 
transitions.  For  simplicity  of  exposition,  this  paper  considers  a 
planar  robot  that  is  allowed  to  freely  rotate  about  its  geometric 
center,  implying  that  |Z|  =  9.  As  an  example,  let  the  robot  be  in 
the  state  q  (see  Fig.  1)  and  then  let  the  robot  make  a  choice  to 
move  to  one  of  the  states  labeled  1  through  8  or  it  may  stay  at  q. 
While  the  state  transition  map  3  (restricted  to  2  x  S)  is  con¬ 
structed  in  the  neighborhood  of  a  PFSA  state  and  the  morph  func¬ 
tion  n  (restricted  to  Qx'L)  are  constructed  under  the  condition 
that,  with  negligible  state  estimation  error,  the  robot  can  choose 
any  controllable  transition  to  execute  at  a  cell.  Hence,  all  control¬ 
lable  transitions  at  a  cell  are  assumed  to  be  equally  likely  for  the 
unsupervised  automaton  Gnav-  The  vector-valued  characteristic 
function  x  (see  Definition  2.3)  is  chosen  based  on  a  priori  known 
goodness  of  individual  states,  explained  in  Subsection  2.1.  Since 
the  environment  map  could  be  only  partially  known,  the  robot 
must  adapt  to  the  unforeseen  obstacles  in  real  time. 

Uncertain  (e.g.,  fault-induced  and  environmental)  disturbances 
may  cause  uncontrollable  transitions  in  the  robot  motion  as  seen 
in  the  example  of  Fig.  1,  where  the  robot  may  erroneously  move 
to  an  unplanned  state,  thus  interfering  with  control  actions  of 
robot  path  planning  (see  Assumption  (4)  at  the  beginning  of  this 
section).  Therefore,  probabilistic  decisions  of  path  planning  need 
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to  be  made  based  on  the  set  of  the  available  controllable  actions 
of  the  robot  to  mitigate  the  effects  of  these  disturbances.  In  this 
model,  the  state  transition  probabilities  serve  as  a  discretized  rep¬ 
resentation  of  the  effects  of  disturbances  on  the  robot  dynamics, 
where  the  transition  probability  represents  the  proba¬ 

bility  of  reaching  state  qj  after  executing  an  action  s  from  state  qj. 
Using  the  simplihed  model  developed  in  the  last  section,  the 
vehicle’s  relative  position  in  the  next  time  step  (from  the  previous 
position  after  it  makes  a  move  j-  E  S  in  the  global  Cartesian  coor¬ 
dinate  frame)  is  expressed  as 

Ar  =  Fcos(i//)  —  G  sin(i//)  (10) 

Ay  =  Fsin(i//)  +  Gcos(i//)  (11) 

The  corresponding  transition  probabilities  P  under  the  specihed 
simplifying  assumptions  are  then  computed  in  terms  of  the  proba¬ 
bilities  V  as 

■s)  =  P([A.r,  Ay]  e  qi\qi,s)  (12) 

In  the  absence  of  disturbances,  P(^,|(5',  ,  i)  is  either  0  or  I  such  that 
P((5(^,-,i)|^,-,^)  =  1  and  P(i?;|(?,-, i)  =  0  Vi?,-  £  Q\8{qi,s). 

While  the  PFSA  Gnav  specifies  the  discretized  motion  of  the 
planar  robot,  the  matrix  P  contains  information  about  the  effects 
of  the  disturbances  on  its  motion.  With  the  knowledge  of  Gnav 
and  P,  the  path  planning  problem  is  reduced  to  the  optimal  super¬ 
vision  problem  of  Gnav  (in  the  sense  of  Definition  2.8)  in  the 
presence  of  disturbances  specified  by  P.  Detection  of  unforeseen 
obstacles  during  the  robot  operation  may  result  in  a  change  of  the 
characteristic  weight  vector  %.  In  this  context,  new  obstacles  are 
considered  as  undesirable  terminating  states  and,  therefore,  are 
assigned  negative  characteristic  weights  (see  Subsection  2.1). 

5  Solution  Approach 

Formulation  of  the  path  planning  problem  as  a  PFSA-based 
navigation  allows  computation  of  optimally  feasible  paths  via  the 
language-measure-theoretic  optimization  scheme.  For  the  unsu¬ 
pervised  model,  the  robot  is  free  to  choose  from  any  of  the  defined 
controllable  events  from  any  cell  in  the  workspace. 

The  optimization  algorithm,  presented  as  Algorithm  1,  selec¬ 
tively  disables  controllable  transitions  to  ensure  that  the  measure 
vector  of  the  navigation  automaton  is  elementwise  maximized. 
Selected  controllable  transitions  are  disabled  to  optimize  the  state 
transition  probability  matrix  P  (see  Definition  2.3).  This  is  accom¬ 
plished  by  maximizing  the  limiting  measure  vector  (see  Eq.  (6)) 
in  the  sense  of  Definition  2.8,  which  corresponds  to  the  optimally 
supervised  PFSA  model  for  the  robot.  This  implies  that  the  super¬ 
vised  robot  is  constrained  only  to  choose  among  the  enabled 
moves  at  each  state  such  that  a  Pareto  trade-off  is  achieved 
between  minimization  of  the  collision  probability  and  simultane¬ 
ous  maximization  of  the  goal-reaching  probability.  An  important 
point  to  note  is  that  even  though  the  measure  values  are  based  on 
optimization  of  a  PFSA,  an  optimal  and  feasible  plan  is  obtained 
that  can  be  executed  in  a  deterministic  sense. 

For  planning  in  the  presence  of  environmental  disturbances,  it 
is  critical  to  consider  the  robot’s  vulnerability  to  uncertainties  cre¬ 
ated  by  such  disturbances  (see  Fig.  1).  In  the  work  presented  in 
Refs.  [9]  and  [II],  the  measure  vector  was  optimized  by  ignoring 
the  uncontrollable  transitions  arising  from  the  uncertainties.  How¬ 
ever,  ignoring  the  uncontrollable  transitions  in  the  presence  of  dis¬ 
turbances  could  be  fatal  for  autonomous  robots.  In  this  paper,  the 
effects  of  the  uncontrollable  transitions  are  taken  into  considera¬ 
tion  while  optimizing  the  measure  vector  of  the  navigation  autom¬ 
aton.  In  this  framework,  a  combinatorial  sequence  of  enabling  and 
disabling  of  transitions  optimizes  a  weighted  combination  of  mini¬ 
mizing  the  probability  of  collision  and  maximizing  the  probability 
of  reaching  the  goal  in  the  presence  of  known  environmental  dis¬ 
turbances.  Depending  on  the  strength  of  environmental 
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Fig.  1  Illustration  of  the  effects  of  uncontrollable  transitions 
due  to  the  disturbances.  It  is  shown  how  uncontrollable  transi¬ 
tions  may  cause  the  robot  to  end  up  in  a  different  neighboring 
state  of  the  origin  cell,  thus  interfering  with  control  actions. 

disturbances,  the  controller  attempts  to  make  an  optimal  trade-off 
between  performance  (i.e.,  path  length)  and  robustness  to 
uncertainties. 

With  Monte  Carlo  simulation  of  the  disturbances  on  the  test 
bed,  conditional  probabilities  for  the  robot  (i.e.,  AUV)  to  termi¬ 
nate  at  a  neighboring  state  qj  are  generated  as 

such  that  ^  =  1  (13) 

qjeNigi) 

where  qj  E  2  is  a  neighboring  state  of  q^  and  i  E  S  represents  the 
control  action. 

Let  Gnav  =  {Q,  2,  <5,  n,  x)  be  the  unsupervised  navigation 
automaton.  From  the  knowledge  of  the  morph  probability  n  and 
the  conditional  probability  distribution  P  due  to  the  environmen¬ 
tal  disturbances,  a  family  of  probability  matrices,  Pg('i'J)i 
j  =  1,2,  •  ■  ■ ,  |W(<?,)|  is  constructed,  where  \N{qi)\  is  cardinality  of 
the  set  Niqf)  of  neighborhood  states  (including  itself)  of  the  state 
qi  (e.g.,  |A^(^,)|  =  9  for  any  internal  state  of  a  planar  robot), 
defined  as 

PGiqhS,qj)  =  P{qj\qi,s)  X  n{qi,s)  Mqi  e  Q  (14) 

where  i  E  S  is  a  control  action  and  3(qi,  i)  =  qj  is  a  neighboring 
state  (including  itself)  of  qj.  (used  in  line  24  of  Algorithm  1) 
is  a  local  measure  vector  of  state  qj  and  consists  of  the  measure 
values  of  the  states  qj  E  N(q,).  The  transition  matrix  Pq  is  opti¬ 
mized  in  an  iterative  fashion  by  Algorithm  1  such  that  the  strings, 
which  are  likely  to  lead  to  collision  in  the  presence  of  the  distur¬ 
bances,  are  disabled.  At  the  same  time,  feasible  paths  to  the  goal 
are  retained  so  that  the  measure  vector  is  elementwise  maximized. 

For  path  planning  in  a  large-dimensional  state  space,  a  distrib¬ 
uted  iterative  scheme  is  recommended  for  the  computation  of  the 
language  measure,  which  was  reported  in  an  earlier  publication 
[18].  The  algorithm  of  optimized  language  measure  vector  is  pre¬ 
sented  as  Algorithm  1 . 

Salient  properties  of  Algorithm  1  are  delineated  below: 

•  Computation  of  the  uncertain  behavior  of  the  robot’s  motion 
caused  by  environmental  disturbances:  The  sequence  of  dis¬ 
abling  and  enabling  actions  at  the  robot’s  discretion  is  gen¬ 
erated  by  (elementwise)  maximization  of  the  language 
measure  vector.  To  relax  the  assumption  of  restriction  of 
uncontrollable  transitions  to  the  neighboring  states,  a  proba¬ 
bility  distribution  over  the  entire  workspace  could  be  consid¬ 
ered  instead  of  using  a  local  measure  vector  in  line  24  of 
Algorithm  1.  This  is  also  true  in  trajectory  planning  for 

Transactions  of  the  ASME 


Downloaded  From:  http://dynamicsystems.asmedigitalcollection.asme.org/  on  10/21/2014  Terms  of  Use:  http://asme.org/terms 


(a) 


(c) 


Fig.  2  Optimal  paths  under  different  directions  of  the  ocean  current.  (The  vehicle  is  con¬ 
strained  to  stay  in  the  box  and  not  hit  the  walls.) 


X-coordinate 

Fig.  3  Real-time  replanning  over  a  finite  time  horizon  T 

UAVs,  where  the  environmental  disturbances  may  require 
widening  of  its  flight  envelope. 

•  Generation  of  the  optimized  measure  vector  v*  for  robot 
navigation  based  on  its  PFSA  model,  along  with  the  associ¬ 
ated  optimized  outputs  n*  and  Pq.  The  path  to  the  goal  from 
arbitrary  initial  locations  is  obtained  by  following  the  gradi¬ 
ent  of  the  measure  vector. 

Real-time  obstacle  avoidance  in  the  presence  of  disturbances  is 
a  difficult  problem  due  to  the  disturbance-induced  uncontrollable 
transitions.  An  online  replanning  algorithm  must  propagate  the 
motion  model  in  time  and  make  sure  that  the  robot  is  clear  of  the 
obstacle.  In  this  regard,  one  may  take  advantage  of  the  fact  that 
the  stochastic  navigation  matrix  (i.e.,  the  optimized  state  transition 
matrix  of  the  controlled  PFSA)  contains  the  information  on  the 
disturbances.  To  make  a  local  adaptation,  the  original  infinite- 
horizon  measure  vector  is  augmented  with  the  information  on 
recently  discovered  obstacles.  The  effects  of  this  obstacle  are  con¬ 
sidered  on  a  finite-time  horizon  of  replanning  by  using  the  already 
optimized  stochastic  navigation  matrix  Pq.  Flowever,  as  the  sto¬ 
chastic  matrix  was  already  optimized  by  considering  the  effects  of 
disturbances,  the  replanning  keeps  the  vehicle  clear  of  the  obstacle 
under  such  disturbances.  The  quality  of  adaptation  is  expected  to 
improve  with  an  increase  in  the  replanning  time  horizon.  Hence,  a 
choice  of  the  replanning  time  horizon  (T)  depends  on  the  trade-off 
between  the  computation  time  and  quality  of  the  solution.  The 
steps  involved  in  replanning  are  succinctly  presented  in  Algorithm 
2,  which  efficiently  computes  a  feasible  path.  This  is  accom¬ 
plished  by  removing  those  strings  that  reach  the  goal  via  the 
newly  detected  obstacle  states  from  the  language  generated  by 
any  state  of  the  PFSA.  The  measure  vector  is  consequently 
updated  by  removing  those  strings  from  the  original  optimal  lan¬ 
guage  generated  by  the  states  of  the  PFSA,  i.e.,  by  subtracting  the 
measure  of  such  strings  (see  Definition  2.5)  from  the  original 


language  measure.  The  resulting  path  improves  monotonically  in 
the  replanning  time  horizon  because  the  estimate  of  the  measure 
of  such  strings  improves  monotonically. 

Algorithm  1.  Distributed  updating  of  measure  vector  for  plan¬ 
ning  in  the  presence  of  disturbances 


Require:  Gnav(C?,  2,  <5,  k,  x),  0,  Pg 
Ensure:  v*  ,n* ,  Pq 
1:  Initialize  Vij,  S  Q,  r'Jj  =  0. 

2:  P^r  =  Pg 
3:  = 

/  *  *  Begin  infinite  asynchronous  loop  *  *  / 

4:  while  true  do 

5:  for  each  cell  qi  G.  Q  do 

6:  if  7V((?/)  7^  0  then 

7:  for  each  node  qj  G  N{qi)  do 

8:  Query  and 

/  *  *  Enabling  and  Disabling  transitions  for  Control  *  */ 
/  *  *  Compare  measure  with  neighbors  and  allow 
transitions  only  to  the  better  neighbors  *  */ 

9:  if  <  Uq  then 

10:  if  Ea:«€V(,,)  PgC'?.,  k)  ^  0  then 

11:  PGW.',i,',:)  =  Pg(‘?,',^m:)  +  Pg"'(‘?/A;,:) 

12:  PG^.Sy,  :)  =  0 

13:  T^{qi,Si)  =  Si)  +  i/) 

14:  Ti{qi,  Sj)  =  0 

15:  end  if 

/*  *  where,  si)  =  qi  &  sj)  =  qj  *  */ 

/  *  *  Disabling  *  */ 


else 

■f  T.to.eNM  Pg(?;,  sj,  k)  ==  0  then 
PGfe..y.:)  =  P£"’(A?,-,.v„:) 

Pg(?.A/,  :)  =  PgC-I.A/,  :)  -  Pc^feA;, :) 

Ti{qi,Si)  =  n{qi,Si)  -  (<?,-, i;) 

n{qi,Sj)  = 

/  *  *  Enabling  *  */ 

end  if 
end  if 

=  EA:«€lVh,0  (1  -  0)PG('?;,-5/t,  +  Ox! 

I  *  *  Node  updating  *  */ 

end  for 
end  if 
end  for 
end  while 


Algorithm  2.  Updating  of  measure  vector  for  replanning 

Require:  Gnav(C,  <5,  n*,  x),0,  Pg,  v*,  |r| 

Ensure: 

1 :  Update  x  with  new  information  about  obstacles 
2:  Initialize  vo  —  v*  &  n  —  I 
3:  P^™  =  Pg  and  7t* 

4:  while  ii  <  |r|  do 

5:  Lines  4  through  24  of  Algorithm  1 

6:  «  =  ;!  +  ! 

7:  end  while 


Remark  5.1.  The  time  complexity  of  Algorithm  1  is 
Ti  ~  0(|2| |A(a/)P),  for  a  PFSA  with  \Q\  states  and  |iV((7)|  is  the 
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X-Coordinate 

Fig.  4  Replanning  with  \T\=2  for  different  characteristic 
weights 

maximum  neighborhood  size  of  the  PFSA  states  [18].  The  time 
complexity  of  Algorithm  2  is  T2  ~  0(|7’||A(^)|),  where  T  is  the 
finite-time  horizon  of  replanning. 

6  Interpretation  of  the  Results 

This  section  presents  pertinent  results  of  AUV  simulation  for 
the  proposed  method  of  path  planning  in  the  presence  of  uncer¬ 
tainties  due  to  ocean  currents.  To  justify  the  usage  of  a 
disturbance-aware  path  planning  algorithm  for  AUVs  and  to  dem¬ 
onstrate  the  algorithm’s  efficacy  to  react  to  different  types  of  dis¬ 
turbances,  Fig.  2  shows  the  (simulated)  optimal  paths  of  an  AUV 
that  is  navigated  from  point  A  to  point  5  in  a  similar  environment 
with  three  different  directions  of  the  ocean  current.  The  optimal 
nominal  paths  in  these  three  cases  are  significantly  different, 
because  the  paths  are  calculated  to  accommodate  the  disturbances 
due  to  the  ocean  current. 

During  actual  navigation,  the  AUV  is  likely  to  deviate  from  the 
nominal  trajectory;  however,  it  finds  a  path  from  any  arbitrary 
state  to  the  terminal  state  by  following  the  gradient  of  the  opti¬ 
mized  measure  vector.  The  nominal  trajectory  in  each  of  the  three 
plates  in  Fig.  2  shows  the  shortest  safe  path  based  on  the  expected 
deviations  due  to  the  disturbances. 

Figure  3  presents  a  scenario,  where  the  environment  is  only  par¬ 
tially  known  a  priori  and  an  unforeseen  obstacle  at  point  D  is 
observed  when  the  AUV  reaches  point  C.  Consequently,  the  origi¬ 
nal  plan  (i.e.,  without  the  knowledge  of  the  obstacle  at  D)  is 
altered  to  avoid  a  potential  collision.  In  Fig.  3,  the  uncontrollable 
transitions  (that  are  due  to  the  ocean  current)  tend  to  push  the  ve¬ 
hicle  upstream  and  path  replanning  attempts  to  bypass  the  obsta¬ 
cle  at  D\  however,  the  amount  of  time  available  for  online 
replanning  is  limited.  Algorithm  2  is  iteratively  executed  to 
improve  the  replanning  monotonically  with  an  increase  in  the 
span  |r|  of  time  horizon  T.  As  seen  for  |r|  =  2, 5,  and  10  in  Fig. 
3,  the  respective  replans  are  generated  to  navigate  the  AUV  fur¬ 
ther  down  so  that  even  if  the  vehicle  is  drifted  upwards  by  the 
ocean  current,  it  would  still  be  able  to  avoid  a  collision  with  the 
obstacle.  By  increasing  the  replanning  time  horizon  from  |r|  =  2 
to  |T|  =  10,  the  plan  becomes  more  robust  to  collision  as  it 
attempts  to  keep  the  vehicle  clear  of  the  obstacle  by  a  larger  dis¬ 
tance  by  optimizing  the  margin  of  error  due  to  the  ocean  current- 
induced  disturbances.  A  small  replanning  time  horizon  results  in 
local  greedy  search  of  the  configuration  space;  it  limits  the  effects 
of  the  new  obstacle  to  a  narrow  region  around  it.  A  smaller  replan¬ 
ning  time  horizon  also  limits  the  expected  deviation  of  the  vehicle 
due  to  disturbances. 

A  global  optimal  policy  could  be  significantly  different  from  a 
local  greedy  solution  obtained  in  limited  time  as  seen  in  Fig.  3, 
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where  the  replanned  path  for  |r|  =  15  is  different  from  those  for 
|T|  =  2,  5,  and  10.  While  Algorithm  2  makes  local  perturbations 
in  the  initial  plan  to  keep  the  AUV  clear  of  an  observed  obstacle 
at  D,  a  sufficiently  large  replanning  time  horizon  (e.g.,  |r|  =  15  in 
Fig.  3)  generates  an  optimal  path  such  that  the  local  changes  in 
the  replanned  path  may  finally  cause  convergence  to  a  global  opti¬ 
mal  path. 

As  the  original  optimal  stochastic  navigation  matrix  contains 
information  of  the  disturbance,  even  a  small  replanning  time  hori¬ 
zon  could  keep  the  AUV  clear  of  the  obstacle  in  the  presence  of 
disturbances;  however,  robustness  of  the  plans  improve  monotoni¬ 
cally  with  increase  in  the  replanning  time  horizon  parameter  |r|. 
In  essence,  a  choice  of  |T|  depends  on  the  in  situ  computational 
capabilities  of  the  AUV  and  the  time  available  for  online  updating 
of  the  path  plan,  which  may  depend  on  the  AUV’s  speed  and 
observation  window. 

For  time-critical  operations  with  limited  computational  capabil¬ 
ities,  a  goal  of  replanning  is  to  ensure  safety  while  maintaining  an 
accepted  level  of  performance  (e.g.,  path  length).  This  trade-off 
could  be  achieved  in  the  language-measure-theoretic  framework 
by  selecting  the  characteristic  weight  of  new-obstacle  states  dur¬ 
ing  replanning,  where  an  increased  penalty  on  collision  with  an 
obstacle  would  enhance  safety  at  the  expense  of  a  longer  deviation 
from  the  original  plan.  This  is  achieved  by  having  the  characteris¬ 
tic  weights  of  the  newly  observed  obstacles  states  as  ^f*Zobs’ 
where  the  multiplicative  constant  A'  is  a  positive  real  parameter, 
while  all  other  characteristic  weights  remain  unchanged.  Figure  4 
shows  the  effects  of  changing  the  characteristic  weight  vector  for 
the  case  |r|  =  2  in  Fig.  3.  As  seen  in  Fig.  4,  the  safety  margin  is 
enhanced  by  increasing  K,  which  is  capable  of  accommodating 
larger  anomalous  behavior  of  the  robot  in  the  presence  of  ocean 
currents  at  the  expense  of  an  increased  path  length.  For  a  family 
of  admissible  values  of  K,  the  convex  hull  of  the  generated  solu¬ 
tions  would  lead  to  a  Pareto-optimal  front  representing  trade-off 
between  safety  and  performance.  For  any  particular  real-time 
operation,  the  choice  of  the  parameter  K  is  dependent  on  computa¬ 
tional  capabilities  and  time  available  for  replanning  as  well  as 
strength  of  disturbances. 

7  Summary,  Conclusions,  and  Future  Work 

This  paper  presents  a  generalized  framework  for  robust  path 
planning  of  AUVs  in  the  presence  of  environmental  disturbances 
and  actuation  errors.  Concepts  of  recently  developed  language 
measure-based  optimization  [10,11]  have  been  applied  to  demon¬ 
strate  successful  navigation  of  an  autonomous  robot  moving  in  an 
uncertain  and  partially  known  environment.  The  algorithm  is  ro¬ 
bust  to  both  modeling  and  environmental  uncertainties;  however, 
these  uncertainties  must  be  bounded.  The  efficacy  of  the  proposed 
concept  is  demonstrated  by  numerical  simulation  on  a  test  bed  of 
an  AUV  moving  in  the  presence  of  ocean  currents  and  unforeseen 
obstacles.  The  paper  is  summarized  below: 

1 .  It  presents  a  path  planning  philosophy  that  is  fundamentally 
different  from  the  ones  reported  in  literature. 

2.  It  performs  online  adaptation  of  the  planning  algorithms  to 
unanticipated  obstacles  in  the  presence  of  disturbances. 

Many  path  planning  algorithms,  reported  in  the  current  litera¬ 
ture,  may  not  be  able  to  efficiently  handle  collision  avoidance  in 
real  time  in  the  presence  of  disturbances,  primarily  due  to  over¬ 
whelming  computational  requirements.  It  is  shown  in  this  paper 
through  numerical  simulations  that  the  language-measure-theo¬ 
retic  algorithm  can  handle  unanticipated  obstacles  in  real  time  in 
the  presence  of  disturbances.  However,  actual  performance  com¬ 
parisons  with  state-of-the-art  planning  are  yet  to  be  demonstrated 
by  experimental  validation. 

While  there  are  numerous  research  areas  for  robot  path  plan¬ 
ning  based  on  language-measure-theoretic  concepts,  the  following 
topics  are  recommended  for  future  research. 
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1.  Construction  of  a  biased  sampling  scheme  while  updating 
the  language  measure  of  nodes  in  a  distributed  fashion  to 
enhance  the  quality  of  path  planning  and  to  mitigate  the 
computational  cost.  Such  a  scheme  is  important  for  a  trade¬ 
off  between  exploration  and  exploitation. 

2.  Extension  of  the  proposed  method  of  robot  path  planning 
under  time  constraints  as  well  as  by  minimizing  the  energy 
consumption. 

3.  Validation  of  the  proposed  method  by  laboratory  experimen¬ 
tation  on  a  networked  robotic  test  bed  with  nonholonomic 
constraints  [8]. 

4.  Performance  comparison  with  other  methods  like  those  pre¬ 
sented  in  Refs.  [5-7]. 
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